Project Arbex (Arbitrary Expressions) #4777

anandthakker · 2017-06-02T19:41:59Z

Intent

Design and implement a style specification syntax for defining "style functions" using arbitrary expressions. Such functions should be usable for:

Style layer property values, wherever zoom functions and property functions are currently supported.
Style layer filters
[future] Intra-layer sorting and grouping of features (Style spec: Add ability to sort features #4361)

Background:

Initial discussion: #4715
Development of working draft: #4754

Important use cases that the final design should cover:

unit conversion
number formatting
- decimal precision
- thousands separator
conditionals (for the token defaults use case)
concatenation (for the token defaults use case)
customizable interpolation
arithmetic on feature properties within functions
localization

Plan

Benchmarks

Benchmark scores:

| benchmark | master `45fde16` | feature/expressions `3fc994c` |
| --- | --- | --- |
| map-load | 148 ms | 121 ms |
| style-load | 76 ms | 132 ms |
| buffer | 1,058 ms | 1,091 ms |
| fps | 60 fps | 60 fps |
| frame-duration | 6 ms, 0% > 16ms | 8 ms, 1% > 16ms |
| query-point | 0.85 ms | 0.84 ms |
| query-box | 51.25 ms | 62.33 ms |
| geojson-setdata-small | 7 ms | 9 ms |
| geojson-setdata-large | 135 ms | 151 ms |

Looking at the DDS-specific benchmark being added in #5207, which exercises tile layout with many zoom-and-property functions, expressions look about 2x slower than the existing functions.

Update: after a couple of quick fixes, it's more like 1.5x instead of 2x.

@mapbox/gl @mapbox/studio @mapbox/cartography-cats @mapbox/support

jfirebaugh · 2017-06-05T20:10:17Z

docs/style-spec/expressions.md

+
+###Decision:
+- `["case", cond1: Boolean, result1: T, cond2: Boolean, result2: T, ..., cond_m, result_m: T, result_otherwise: T] -> T`
+- `["match", x: T, a_1: T, y_1: U, a_2: T, y_2: U, ..., a_m: T, y_m: U, y_else: U]` - `a_1`, `a_2`, ... must be _literal_ values of type `T`.


Can we generalize this so that multiple literal values can be mapped to the same output expression?

Good idea 👍

Simplest syntax for this would be:

["match", x: T, [a_1: T, a_2: T, ..., a_N: T], y_a: U, [b_1: T, b_2: T, ..., b_M: T], y_b: U, ..., y_else: U]

The only bad thing about this is that it would take us away from the purity of every array in the JSON representing an expression (since we could consider the curve types like ["linear"] to be expressions of a private CurveType type). Is maintaining such syntax purity worthwhile? Some ways we could do that:

["match", x: T, [ "vector", a_1: T, a_2: T, ... a_N: T], y_a: U, [ "vector", b_1: T, b_2: T, ..., b_M], y_b: U, ..., y_else: U]. But allowing a "vector" constructor may not be a great idea, since we don't have a way for users to specify arbitrary types.

We could use "json_array" instead -- but that's not great type-wise, since it specifically produces Vector<Value>, not Vector<T>.

A specialized "match_case" expression.

I think we should use the simplest syntax. It's fine if "match" is a special form. (case is in scheme too.)
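A minimal sketch of how the grouped-label form of "match" could be evaluated, assuming a toy evaluator in which literals evaluate to themselves and `["get", key]` reads a feature property. The names `evaluateMatch` and `makeEvaluator` are hypothetical and are not taken from the GL JS implementation.

```javascript
// Hypothetical evaluator for ["match", input, labels1, output1, ..., otherwise].
// A label position holds either a single literal or an array of literals.
function evaluateMatch(expr, evaluate) {
    const input = evaluate(expr[1]);
    for (let i = 2; i < expr.length - 1; i += 2) {
        const labels = Array.isArray(expr[i]) ? expr[i] : [expr[i]];
        if (labels.includes(input)) {
            return evaluate(expr[i + 1]);
        }
    }
    // No label matched: fall through to the trailing "otherwise" output.
    return evaluate(expr[expr.length - 1]);
}

// Toy evaluator (an assumption for this sketch): only "get" is understood.
function makeEvaluator(properties) {
    return function evaluate(e) {
        if (Array.isArray(e) && e[0] === 'get') return properties[e[1]];
        return e;
    };
}

const roadWidth = ['match', ['get', 'class'],
    ['motorway', 'trunk'], 4,
    'primary', 3,
    1];

console.log(evaluateMatch(roadWidth, makeEvaluator({class: 'trunk'})));
```

Because the label arrays are inspected by "match" itself and never evaluated, this is exactly the special-form behavior discussed above: the inner arrays are not expressions.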

jfirebaugh · 2017-06-05T20:11:39Z

docs/style-spec/expressions.md

+- `Boolean`
+  - Literal: `true` or `false`
+- `Color`
+- `JSONObject`


Let's just call it Object.

jfirebaugh · 2017-06-05T20:51:16Z

docs/style-spec/expressions.md

+  - TODO: without type inference, 0-length arrays and vectors can't be typed
+- `Value`: A "variant" type representing the set of possible values retrievable from a feature's `properties` object (`Null | String | Number | Boolean | JSONObject | Vector<Value>`)
+- `Error`: a subtype of all other types. Used wherever an expression is unable to return the appropriate supertype. Carries diagnostic information such as an explanatory message and the expression location where the Error value was generated.
+


What about enumerated types? I think we'll find that leaving them stringly-typed is less than ideal.

@jfirebaugh Can you say more about why you think we’ll want these over strings? specifically: are there cases where such types would be useful for checking what a subexpression evaluates to?

So far, the use cases I can think of for enums are for checking the final output of an expression that's being used for an enum-typed style property like symbol-placement. If it's true that this is the main reason for them, then the question is which is simpler/better: adding enums to the type system here, or treating the validation of style property values as a separate step?

Since there are, e.g., numeric properties like *-opacity that can't really be validated via types (at least not normal ones), seems like we're gonna have the extra validation step anyway, which makes me lean towards the latter.

Could be wrong, just a gut feeling that we'll run into issues in practice, or as we implement it (especially in native). I don't have any concrete examples.

jfirebaugh · 2017-06-05T20:52:50Z

docs/style-spec/expressions.md

+###Lookup:
+- `["get", obj: JSONObject, key: String ] -> Value`
+- `["has", obj: JSONObject, key: String ] -> Boolean`
+- `["at", arr: JSONArray, index: Number] -> Value`


This is redundant to the polymorphic version on the next line, right?

Whoops, yep thanks.

jfirebaugh · 2017-06-05T20:54:27Z

docs/style-spec/expressions.md

+- min, max
+
+###String:
+- `["concat", expr1: T, expr2: U, …] -> String`


["concat", expr: String, …] -> String?

I was thinking it might be okay, as a convenience, for concat to implicitly convert all of its arguments, on the premise that this should always be possible without error as long as the value isn't an Error.

Is there a particular reason to support implicit conversion for concat but not other expressions (+, &&, up/downcase, ...)?

Not a hugely compelling one... just the combination of:

Combining string literals, string property values, and numeric property values will be a very common use case (i.e. the equivalent of using {} token replacement for text-field).

Automatic conversions to String will never fail for values that aren't already Error, so the explicit cast doesn't seem to me to add any value.

Would there be any advantage to having a string formatting function instead of concat?

On one hand it would add complexity because it would add function-specific syntax. On the other hand you could handle the entire "this is what I want the displayed text to be" in a single function, which might be more UI-friendly.
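To make the implicit-conversion idea concrete, here is a rough sketch of a "concat" that converts every argument to a string before joining. The conversion rules (null becomes an empty string, objects and arrays are JSON-serialized) are assumptions chosen for illustration; the thread does not settle them.

```javascript
// Hypothetical conversion used by the sketch; the exact rules are assumptions.
function toDisplayString(value) {
    if (value === null || value === undefined) return '';
    if (typeof value === 'string') return value;
    if (typeof value === 'number' || typeof value === 'boolean') return String(value);
    return JSON.stringify(value); // objects and arrays
}

// "concat" with implicit conversion: no argument needs an explicit cast.
function concat(...values) {
    return values.map(toDisplayString).join('');
}

console.log(concat('Elevation: ', 1200, ' m'));
```

Every branch of `toDisplayString` returns a string for ordinary feature property values, which is the premise behind skipping the explicit cast.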

1ec5 · 2017-06-13T19:06:21Z

docs/style-spec/expressions.md

+}
+```
+
+## Property Expressions


This document refers exclusively to “expressions”, but the title of this PR refers to “arbitrary expressions”. To avoid confusion, we should settle on one name for the feature and eliminate the other name. As catchy as “Arbex” sounds, I prefer simply “expressions” for these reasons:

Studio users, who may be unfamiliar with the style specification’s evolution, may perceive “arbitrary” as a negative trait.

“Arbitrary expressions” would seem to imply a non-arbitrary expression feature that doesn’t exist.

“Expressions” aligns with common usage on the iOS platform and in other kinds of software.

/cc @dasulit

ivovandongen · 2017-07-04T07:13:51Z

@anandthakker Could we consider using a (formal) specification for the expression syntax? Something like ABNF would let us use existing parser generators like ANTLR and others to generate platform specific bindings and also parts of the core implementations and tests.

anandthakker · 2017-07-04T11:00:37Z

@ivovandongen Yeah, I've definitely been pondering this. There are some things that would make it challenging to specify expressions fully and formally:

It's a JSON 'syntax', not text, so I don't think existing methods (like BNF) would work.
Thus far, we've had no need to include a way to represent every type in the system. Types exist, but only in the implementation--not in the language itself. In order to formally specify each expression in a spec, we'd need to introduce a serialized representation of types, so that we could specify, e.g., that the 'length' expression is typed (Array<T>, String) => Number.
Each expression has not only unique evaluation behavior, but also potentially custom parsing needs. That's not a dealbreaker for a formal spec, but it does limit how much useful code we can generate from such a spec.

Beyond that, it's also just more machinery in general. As such, when @jfirebaugh and I discussed this yesterday we landed on the notion that, at least for now, we could just use the module in GL JS that defines each expression (found at src/style-spec/function/definitions/index.js) as the "source of truth". Even though that module's first purpose is to actually provide the GL JS implementation of each expression, we can also use it to build scripts to generate docs and core / SDK code. If/when that becomes problematic, we can always revisit the question of moving the 'source of truth' into a more complete specification within the style spec reference JSON.

jfirebaugh · 2017-07-14T21:04:17Z

src/style-spec/function/expression.js

+                throw new ParsingError(key, `Expected string, but found ${typeof name} instead`);
+
+            if (context.definitions[name])
+                throw new ParsingError(key, `"${name}" is reserved, so it cannot not be used as a "let" binding.`);


Since this is getting compiled to function (name, ...) { ... }, we need to either prohibit JS keywords as well, or sanitize names, say by prefixing them with _.

👍 yep, tracking that on my list of tail work but forgot to add it to the top of this ticket. Adding now.
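The prefixing option could look roughly like the sketch below. The helper names, the restriction to alphanumeric names, and the shape of the generated function are all assumptions for illustration; only the idea of prefixing with `_` comes from the comment above.

```javascript
// Hypothetical helper, not part of GL JS: prefix each "let" binding name with "_"
// so that names such as "class" or "function" become legal JS parameter names.
// No JS reserved word starts with an underscore.
function sanitizeBindingName(name) {
    if (typeof name !== 'string' || !/^[A-Za-z0-9_]+$/.test(name)) {
        throw new Error(`Invalid "let" binding name: ${JSON.stringify(name)}`);
    }
    return `_${name}`;
}

// Assumed shape of the compiled output: function (_name1, _name2, ...) { return <body>; }
function compileLet(names, bodySource) {
    const params = names.map(sanitizeBindingName);
    return new Function(...params, `return ${bodySource};`);
}

// "class" and "function" are JS keywords, but their prefixed forms are not.
const add = compileLet(['class', 'function'], '_class + _function');
console.log(add(1, 2));
```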

jfirebaugh · 2017-07-14T21:13:37Z

docs/style-spec/expressions.md

+- `[ "e" ] -> Number`
+
+### Variable binding:
+- `[ "let", name1: String, e1, name2: String, e2, ..., e_result ]` - Bind expression `e1` to the string `name1`, `e2` to `name2`, etc., before evaluating `e_result`.  The bound expressions may be referenced within `e_result` with `[ name1 ]`, `[ name2 ]`, etc. (E.g.: `["let", "a", 1, "b", ["number", ["get", "blah"]], [ "+", ["a"], ["b"] ]`.)


I don't think we want to use [name1] as the syntax for this -- that would mean that any new expression form we want to add in a future version would potentially break existing expressions that used that name as an identifier. Better to have an explicit form like ["var", name].

🤦‍♂️ yeah. ["ref", name1]? ["var", name1]? Less concise, but actually might allow us to avoid some special logic for var reference expressions.
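A small sketch of why an explicit reference form avoids the naming clash: with `["var", name]`, bound names live in their own scope and are never confused with expression names. The evaluator below is a toy written for this illustration; it assumes later bindings can see earlier ones, which the thread does not specify.

```javascript
// Toy evaluator supporting "let", "var", and "+" only (illustrative, not GL JS code).
function evaluate(expr, scope = new Map()) {
    if (!Array.isArray(expr)) return expr; // literals evaluate to themselves
    const [op, ...args] = expr;
    switch (op) {
    case 'let': {
        const inner = new Map(scope);
        // Bindings come in (name, expression) pairs; the last argument is the result.
        for (let i = 0; i < args.length - 1; i += 2) {
            inner.set(args[i], evaluate(args[i + 1], inner));
        }
        return evaluate(args[args.length - 1], inner);
    }
    case 'var':
        if (!scope.has(args[0])) throw new Error(`Unknown variable "${args[0]}"`);
        return scope.get(args[0]);
    case '+':
        return args.reduce((sum, a) => sum + evaluate(a, scope), 0);
    default:
        throw new Error(`Unknown expression "${op}"`);
    }
}

// A binding may even reuse an operator's name without changing what ["+", ...] means.
console.log(evaluate(['let', '+', 5, ['+', ['var', '+'], 1]]));
```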

jfirebaugh

Rather than extending LambdaExpression, the Expression subclasses for match, case, curve, and coalesce should directly implement Expression, providing custom parse, typecheck, and compile methods, and using structured member data rather than a generic args. For example, the member data for MatchExpression should be something like:

    input: Expression;
    cases: Array<[Array<string | number>, Expression]>;
    otherwise: Expression;

This will reduce the complexity of LambdaExpression and other parts of the codebase, likely removing the need for typename, nargs, isGeneric, and so on. At the same time, using more strictly typed member data will reveal edge cases that are not currently handled/tested. As an experiment, I sketched this out for match. Edge cases:

No arguments (["match"])
One argument (["match", foo])
Two arguments (["match", foo, z])
Null or boolean key (["match", foo, null, x, z]) -- should we allow this? It's better written with case IMO.
Object or array key (["match", foo, {...}, x, [[...]], y, z])
Mixed key types (["match", foo, 0, x, "0", y, z], ["match", foo, [0, "0"], x, z]). This is currently allowed, which surprised me, but I guess it is useful for munging datasets that are inconsistently typed.
Mismatched input/key types (["match", 0, "0", x, "1", y, z])
Mismatched result types (["match", foo, 0, 0, 1, "1", false])
Repeated keys (["match", foo, 0, x, 0, y, z])
Non-integer keys (["match", foo, 0.333, x, z])
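As an illustration of how structured member data surfaces these edge cases, here is a hedged sketch of a custom parse step for "match". The `ParsingError` class, the error messages, and the decisions to reject null keys, repeated keys, and a "match" with no cases are assumptions made for this sketch; the list above leaves several of them open. Type mismatches are not covered because they belong to typechecking.

```javascript
// Hypothetical error type for this sketch.
class ParsingError extends Error {}

// Parses ["match", input, label1, output1, ..., otherwise] into
// {input, cases: Array<[labels, output]>, otherwise}.
function parseMatch(args) {
    if (args.length < 5) {
        throw new ParsingError(`Expected at least 4 arguments, but found only ${args.length - 1}.`);
    }
    if (args.length % 2 !== 1) {
        throw new ParsingError('Expected an even number of arguments.');
    }
    const cases = [];
    const seen = new Set();
    for (let i = 2; i < args.length - 1; i += 2) {
        const labels = Array.isArray(args[i]) ? args[i] : [args[i]];
        if (labels.length === 0) throw new ParsingError('Expected at least one label.');
        for (const label of labels) {
            if (typeof label !== 'string' && typeof label !== 'number') {
                throw new ParsingError('Match labels must be strings or numbers.');
            }
            // Keep 0 and "0" distinct so mixed key types remain possible.
            const key = `${typeof label}:${label}`;
            if (seen.has(key)) throw new ParsingError('Match labels must be unique.');
            seen.add(key);
        }
        cases.push([labels, args[i + 1]]);
    }
    return {input: args[1], cases, otherwise: args[args.length - 1]};
}

console.log(parseMatch(['match', ['get', 'x'], [0, '0'], 'a', 'b']).cases.length);
```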

jfirebaugh · 2017-07-17T23:35:15Z

Continuing to look at how to remove TypeName from the Type variant list...

Besides match, case, curve, and coalesce, the expression forms with "special" typing rules are:

The forms that take a variable number of arguments of a single type: concat, +, *, &&, and ||. These are fairly easy to handle with a special case varargs signature type, a simplified version of nargs which only CompoundExpression#typecheck needs to handle (type Signature = Array<Type> | Varargs).
length: the array signature wants to match an array of any type. This seems like it should be handled by the array subtyping rules somehow.
at: the output type is the element type of the input array type. This is the closest to "inference" we seem to get. Maybe it should just have a custom typecheck override?

Another thing I noticed is that typecheck operates one level "deeper" than typing judgements are usually expressed. Typing judgements are usually of the form

let t1 be the type of subexpression a, t2 be the type of subexpression b, if <some predicate on t1 and t2>, then the type of [expr, a, b] is type t3

Note that there's no "expected" type input to this judgement: if you need to check that t3 is a subtype of some expected type, then you do that externally to checking expr itself. I think our typechecking code would be clearer if it was written like this.
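A rough sketch of typechecking in that style: `typeOf` computes the type of an expression bottom-up from its subexpressions, and the comparison against an expected type happens in a separate function. Types are plain strings, subtyping is reduced to equality, and the `signatures` table (including its varargs entries) is hypothetical; none of this mirrors the actual GL JS code.

```javascript
// Hypothetical signature table: either fixed params or a single varargs type.
const signatures = {
    '+': {varargs: 'Number', result: 'Number'},
    '*': {varargs: 'Number', result: 'Number'},
    'upcase': {params: ['String'], result: 'String'}
};

const literalTypes = {number: 'Number', string: 'String', boolean: 'Boolean'};

// The judgement: given the argument types t1, t2, ..., derive the result type t3.
function typeOf(expr) {
    if (!Array.isArray(expr)) {
        const t = literalTypes[typeof expr];
        if (!t) throw new TypeError(`Unsupported literal: ${String(expr)}`);
        return t;
    }
    const [op, ...args] = expr;
    const sig = signatures[op];
    if (!sig) throw new TypeError(`Unknown expression "${op}"`);
    const argTypes = args.map(typeOf);
    const expected = sig.varargs ? argTypes.map(() => sig.varargs) : sig.params;
    if (expected.length !== argTypes.length) {
        throw new TypeError(`"${op}" expects ${expected.length} arguments, found ${argTypes.length}`);
    }
    argTypes.forEach((t, i) => {
        if (t !== expected[i]) {
            throw new TypeError(`"${op}" argument ${i + 1}: expected ${expected[i]}, found ${t}`);
        }
    });
    return sig.result;
}

// The "expected type" check lives outside the judgement itself.
function checkAgainst(expr, expectedType) {
    const actual = typeOf(expr);
    if (actual !== expectedType) {
        throw new TypeError(`Expected ${expectedType}, found ${actual}`);
    }
    return actual;
}

console.log(typeOf(['+', 1, ['*', 2, 3]]));
```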

anandthakker · 2017-07-18T00:21:28Z

@jfirebaugh 👍 I came to similar tentative conclusions re: Varargs and length. For at, I was thinking of two possibilities:

at: (Number, Array) => Value (Array being Array<Value>). As with length, array subtyping should allow any array here (though I think currently, Value might, incorrectly, fail to match Array<T> for T != Value).
Promote at to be a special form with custom typecheck.

typecheck operates one level "deeper" than typing judgements are usually expressed

Oh yeah, good point -- this was originally the case to help make type inference of generics more straightforward, but now that we're eliminating that need, we can simplify it 🙌

anandthakker · 2017-07-23T01:41:50Z

src/style-spec/function/definitions/index.js

+                    context.key,
+                    'The "zoom" expression may only be used as the input to a top-level "curve" expression.'
+                );
+            }


This check should happen in function/index.js rather than here, because this restriction on [zoom] only applies to expressions used for style property values (as opposed to, say, filters)

anandthakker · 2017-08-30T03:21:15Z

@mourner @jfirebaugh posted some benchmarks in the top of the ticket - looks like DDS is about 2x slower with expressions, but hopefully quick profiling will yield some low-hanging fixes that will reclaim some of that.

mourner · 2017-08-30T06:19:25Z

@anandthakker thanks! The frame-duration 6ms to 8ms change is also worth looking into.

ChrisLoer · 2017-08-30T16:30:27Z

The frame-duration 6ms to 8ms change is also worth looking into.

#5150 adds to frame-duration as well (because it's doing collision detection during rendering). It's mainly an issue for zooms/maps with very dense line labeling, which is kind of the worst case for collision detection. We should try to keep an eye on how the two PRs interact.

anandthakker · 2017-08-30T16:35:15Z

@mourner @ChrisLoer After 6f4a6c4 the impact on frame-duration is approximately 0.84 ms:

mourner · 2017-08-31T15:31:46Z

src/style-spec/function/compile.js

+
+        return {
+            result: 'success',
+            function: fn.bind(evaluationContext()),


might be worth trying a custom bind or an arrow fn — calls of natively bound functions are expensive in many browsers
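A sketch of the closure-based alternative to `Function.prototype.bind`. The two-argument call shape and the name `bindWithClosure` are assumptions for illustration, and whether this is actually faster would need to be measured in the target browsers.

```javascript
// Hypothetical replacement for fn.bind(context): an arrow function that
// forwards a fixed set of arguments and supplies `this` via call().
function bindWithClosure(fn, context) {
    return (a, b) => fn.call(context, a, b);
}

// Example compiled function that relies on `this` being the evaluation context.
function compiled(a, b) {
    return this.scale * (a + b);
}

const evaluateWithContext = bindWithClosure(compiled, {scale: 2});
console.log(evaluateWithContext(1, 2));
```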

mourner · 2017-08-31T15:37:00Z

src/style-spec/function/evaluation_context.js

+        } else if (expectedType.kind === 'Value') {
+            typeError = true;
+        } else if (expectedType.kind !== 'Array') {
+            typeError = typeof value !== expectedType.kind.toLowerCase();


nitpick: toLowerCase will be called on almost every get; might be preferable to make a lookup table for kinds instead
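One way the lookup table could look, as a minimal sketch. The set of kinds in the table and the helper name `hasTypeError` are assumptions; the quoted diff only shows that `kind` is lowercased and compared with `typeof value`.

```javascript
// Precomputed mapping from type kind to the corresponding `typeof` result,
// so the hot path does no string manipulation. The listed kinds are assumptions.
const typeofForKind = Object.freeze({
    String: 'string',
    Number: 'number',
    Boolean: 'boolean',
    Object: 'object'
});

// Returns true when `value` does not have the primitive type implied by `kind`.
// Kinds missing from the table are always reported as a mismatch.
function hasTypeError(value, kind) {
    return typeof value !== typeofForKind[kind];
}

console.log(hasTypeError('residential', 'String'));
```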

mourner · 2017-08-31T15:39:25Z

src/style-spec/function/evaluation_context.js

+            } catch (e) {
+                if (thunks.length === 0) throw e;
+            }
+        }


Could we get away with a simple for loop here? Here, we create an array from arguments, and then pop values one by one from it, when we could just iterate over them, which is much cheaper.

Ah, this is actually obsolete code after df245c1 - will remove.
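For reference, the loop-based variant suggested above might look like the sketch below. It assumes the quoted code evaluates a list of thunks and returns the first result that does not throw, rethrowing only when the last thunk fails; the function name `coalesce` is a guess, since the excerpt does not show it.

```javascript
// Iterates over `arguments` directly instead of copying them into an array
// and popping. Returns the first thunk result that does not throw.
function coalesce() {
    for (let i = 0; i < arguments.length; i++) {
        try {
            return arguments[i]();
        } catch (e) {
            // Only the failure of the final thunk is propagated.
            if (i === arguments.length - 1) throw e;
        }
    }
    throw new Error('Expected at least one argument.');
}

console.log(coalesce(() => { throw new Error('missing'); }, () => 5));
```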

gabimoncha · 2018-07-13T09:32:49Z

I see that issue #4820 ("Support contains, begins with, and ends with filter option for strings") is referenced in this package. But I cannot figure out the syntax for the expression to filter strings that begin with a given value. Can anyone help me with it?

anandthakker added the cross-platform 📺 and under development labels Jun 2, 2017

jfirebaugh reviewed Jun 5, 2017

anandthakker mentioned this pull request Jun 6, 2017

Allow null as a filter value #4410

Closed

1ec5 reviewed Jun 13, 2017

brandonbloom added a commit to brandonbloom/mapbox-gl-js that referenced this pull request Jun 14, 2017

Eval hack until arbitrary expressions are supported.

abac614

Ideally this will no longer be necessary after this happens: mapbox#4777

anandthakker mentioned this pull request Jun 19, 2017

JS arbex implementation: parse, typecheck, compile expressions #4841

Merged

19 tasks

1ec5 mentioned this pull request Jun 27, 2017

Reimplement MGLStyleValue atop NSExpression mapbox/mapbox-gl-native#8074

Closed

anandthakker force-pushed the feature/expressions branch 2 times, most recently from b75efe5 to c1c29cb Compare June 30, 2017 12:27

andrewharvey mentioned this pull request Jul 3, 2017

Property merging or calculated values #2671

Closed

brandonbloom added a commit to brandonbloom/mapbox-gl-js that referenced this pull request Jul 5, 2017

Eval hack until arbitrary expressions are supported.

e058a47

Ideally this will no longer be necessary after this happens: mapbox#4777

anandthakker mentioned this pull request Jul 6, 2017

Expressions mapbox/mapbox-gl-native#9439

Merged

10 tasks

jfirebaugh mentioned this pull request Jul 12, 2017

Respect 'base' in composite functions mapbox/mapbox-gl-native#8654

Closed

jfirebaugh reviewed Jul 14, 2017

jfirebaugh reviewed Jul 15, 2017

anandthakker mentioned this pull request Jul 21, 2017

Data Driven Styling Based on Feature Id #3804

Closed

anandthakker commented Jul 23, 2017

anandthakker force-pushed the feature/expressions branch from 2d82f1a to 3bc2561 Compare July 24, 2017 22:02

anandthakker force-pushed the feature/expressions branch 6 times, most recently from cab3b58 to db7c15c Compare August 5, 2017 03:12

anandthakker force-pushed the feature/expressions branch 3 times, most recently from 75fcd51 to 73fff34 Compare August 29, 2017 22:33

anandthakker force-pushed the feature/expressions branch from 6f4a6c4 to f6e716b Compare August 30, 2017 19:12

mourner reviewed Aug 31, 2017

anandthakker force-pushed the feature/expressions branch from 2ab1313 to d68949a Compare August 31, 2017 19:11

jfirebaugh approved these changes Aug 31, 2017

Anand Thakker added 4 commits August 31, 2017 15:39

Implement expressions

0728c07

Update render tests for expressions

5b2f59f

Add expression integration tests

e3d5271

Add expressions to style spec docs

9ee7362

anandthakker force-pushed the feature/expressions branch from d68949a to 9ee7362 Compare August 31, 2017 19:47

anandthakker merged commit 9ee7362 into master Aug 31, 2017

This was referenced Aug 31, 2017

Allow developers to specify a default/fallback value for a token when a feature property is undefined #4079

Closed

Reinstate stop function documentation #5223

Closed

mourner mentioned this pull request Sep 1, 2017

Recover stop functions performance after switching to expressions #5224

Closed

anandthakker mentioned this pull request Sep 1, 2017

Rationalize and simplify the taxonomy of functions #4154

Closed

mourner mentioned this pull request Sep 1, 2017

Cache evaluation context to speed up expressions #5225

Merged

ChrisLoer mentioned this pull request Sep 5, 2017

Viewport collision detection #5150

Merged

18 tasks

1ec5 mentioned this pull request Sep 10, 2017

Add "Linear" MGLInterpolationMode mapbox/mapbox-gl-native#8457

Closed

anandthakker deleted the feature/expressions branch September 16, 2017 01:23

waissbluth mentioned this pull request Oct 25, 2017

Support nested objects and arrays for GeoJSON features in query*Features #2434

Open

anandthakker added a commit to mapbox/mapbox-gl-native that referenced this pull request Nov 8, 2017

Implement Expressions (#9439)

f648cfe

Ports mapbox/mapbox-gl-js#4777 (and its several follow-ups)

NicolasMinghetti mentioned this pull request Nov 9, 2017

map.getStyle() after map.setFilter(id, null) returns invalid Style JSON #5637

Closed

jfirebaugh mentioned this pull request Dec 5, 2017

Convert functions to expressions; add interpolation color spaces #5812

Closed
